Complex Predicates in Indian Language Wordnets

Authors

  • Pushpak Bhattacharyya
  • Debasri Chakrabarti
  • Vaijayanthi M. Sarma
Abstract

Wordnets, which are repositories of lexical semantic knowledge containing semantically linked synsets and lexically linked words, are indispensable for work in computational linguistics and natural language processing. While building wordnets for Hindi and Marathi, two major Indo-European languages, we observed that the verb hierarchy in the Princeton WordNet was rather shallow. We set out to construct a verb knowledge base for Hindi that arranges Hindi verbs in a hierarchy based on the is-a (hypernymy) relation. We realized that there are phenomena unique to Indian languages that bear upon the choice between lexicalizing an expression and deriving it syntactically. One such example is the occurrence of conjunct and compound verbs (called Complex Predicates), which are found in all Indian languages. This paper presents our experience in constructing lexical knowledge bases for Indian languages, with special attention to Hindi. We address the question of whether to store or derive complex predicates from both linguistic and computational perspectives. We have constructed empirical tests to decide whether a combination of two words, the second of which is a verb, is a complex predicate. Such tests provide a principled way of deciding the status of complex predicates in Indian language wordnets. An additional application of this work is the possibility of automatically augmenting the wordnet using corpora, a topic of great interest in current research.

Similar resources

The Interlanguage of Persian Learners of Italian: a Focus on Complex Predicates

This paper aims at investigating the acquisition of Italian complex predicates by native speakers of Persian. Complex predication is not as pervasive a phenomenon in Italian as it is in Persian. Yet Italian native speakers use complex predicates productively; spontaneous data show that Persian learners of Italian seem to be perfectly aware of Italian complex predicates and use this familiar fea...

Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets

India is a country with 22 officially recognized languages, and 17 of these have WordNets, a crucial resource. Web-browser-based interfaces are available for these WordNets, but they are not suited to mobile devices, which deters people from using this resource effectively. We present our initial work on developing mobile applications and browser extensions to access WordNets for Indian Languages. Ou...

Extending Wordnets To Implicit Information

WordNets mostly deal with lexicalized expressions and the lexical-semantic relations among them. Concepts are represented by sets of synonyms (synsets), which constitute the nodes of the network. Each synset includes the lexicalized expressions that correspond to a given concept. This paper adduces evidence supporting the claim that some concepts, expressed by a subtype of complex telic predica...

The Representation of Complex Telic Predicates in Wordnets: the Case of Lexical-Conceptual Structure Deficitary Verbs

This paper has a twofold aim: (i) to point out that telicity is both a lexical and a compositional semantic feature; (ii) to propose a straightforward solution for representing lexical telicity in wordnet-like computational lexica. The approach presented here subsumes the basic idea that the lexicon is not a repository of idiosyncrasies. It is rather organized following a few general (universal or par...

IndoWordNet and its Linking with Ontology

Reasoning about natural language requires combining semantically rich lexical resources with world knowledge provided by ontologies. In this paper, we describe the linking of WordNets of Indian languages with an upper ontology, SUMO (Suggested Upper Merged Ontology). This creates a multilingual resource for Indian languages that can be used in various natural language processing applications. This p...

Publication date: 2007